Smart Paper Notes

Variational Inference for Additive Main and Multiplicative Interaction Effects Models

Antônia A. L. Dos Santos , Rafael A. Moral , Danilo A. Sarti , Andrew C. Parnell

Categories: (Statistical) Machine Learning | Machine Learning

2022-06-29

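In plant breeding, the presence of a genotype-by-environment (GxE) interaction has a strong impact on cultivation decisions and on the introduction of new crop cultivars. The combination of linear and bilinear terms has been shown to be very useful in modelling this type of data. A widely used approach to identifying GxE is the Additive Main Effects and Multiplicative Interaction Effects (AMMI) model. However, because the data are frequently high-dimensional, Markov chain Monte Carlo (MCMC) approaches can be computationally infeasible. In this article, we consider a variational inference approach for such a model. We derive variational approximations for estimating the parameters and compare the approximations to MCMC using both simulated and real data. The new inferential framework we propose is on average two times faster while maintaining the same predictive performance as MCMC.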
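
For orientation, the combination of linear and bilinear terms mentioned above is usually written as follows in the AMMI literature. The notation is the conventional one and is an assumption here; the abstract itself does not define symbols, and the paper's own notation may differ.

```latex
y_{ij} = \mu + g_i + e_j + \sum_{q=1}^{Q} \lambda_q \, \gamma_{iq} \, \delta_{jq} + \epsilon_{ij},
\qquad \epsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here $y_{ij}$ is the response of genotype $i$ in environment $j$, $\mu$ is the grand mean, $g_i$ and $e_j$ are the additive genotype and environment main effects (the linear terms), and each product $\lambda_q \gamma_{iq} \delta_{jq}$ is one of $Q$ multiplicative interaction components (the bilinear terms).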

translated by Google Translate

Rapid Extraction of Respiratory Waveforms from Photoplethysmography: A Deep Encoder Approach

Harry J. Davies , Danilo P. Mandic

Categories: Machine Learning

2022-12-22

Much of the information of breathing is contained within the photoplethysmography (PPG) signal, through changes in venous blood flow, heart rate and stroke volume. We aim to leverage this fact by employing a novel deep learning framework which is based on a repurposed convolutional autoencoder. Our model aims to encode all of the relevant respiratory information contained within the photoplethysmography waveform, and decode it into a waveform that is similar to a gold standard respiratory reference. The model is employed on two photoplethysmography data sets, namely Capnobase and BIDMC. We show that the model is capable of producing respiratory waveforms that approach the gold standard, while in turn producing state-of-the-art respiratory rate estimates. We also show that when it comes to capturing more advanced respiratory waveform characteristics such as duty cycle, our model is for the most part unsuccessful. A suggested reason for this, in light of a previous study on in-ear PPG, is that the respiratory variations in finger-PPG are far weaker compared with other recording locations. Importantly, our model can perform these waveform estimates in a fraction of a millisecond, giving it the capacity to produce over 6 hours of respiratory waveforms in a single second. Moreover, we attempt to interpret the behaviour of the kernel weights within the model, showing that in part our model intuitively selects different breathing frequencies. The model proposed in this work could help to improve the usefulness of consumer PPG-based wearables for medical applications, where detailed respiratory information is required.

POPNASv3: a Pareto-Optimal Neural Architecture Search Solution for Image and Time Series Classification

Andrea Falanti , Eugenio Lomurno , Danilo Ardagna , Matteo Matteucci

Categories: Machine Learning | Artificial Intelligence | Computer Vision | Neural and Evolutionary Computing

2022-12-13

The automated machine learning (AutoML) field has become increasingly relevant in recent years. These algorithms can develop models without the need for expert knowledge, facilitating the application of machine learning techniques in industry. Neural Architecture Search (NAS) exploits deep learning techniques to autonomously produce neural network architectures whose results rival the state-of-the-art models hand-crafted by AI experts. However, this approach requires significant computational resources and hardware investments, making it less appealing for real-world applications. This article presents the third version of Pareto-Optimal Progressive Neural Architecture Search (POPNASv3), a new sequential model-based optimization NAS algorithm targeting different hardware environments and multiple classification tasks. Our method is able to find competitive architectures within large search spaces, while keeping a flexible structure and data processing pipeline to adapt to different tasks. The algorithm employs Pareto optimality to reduce the number of architectures sampled during the search, drastically improving the time efficiency without loss in accuracy. The experiments performed on image and time series classification datasets provide evidence that POPNASv3 can explore a large set of assorted operators and converge to optimal architectures suited for the type of data provided under different scenarios.
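
To make the Pareto-optimality idea concrete, the sketch below keeps only the candidates that are not dominated by any other candidate. It is a generic illustration, not the POPNASv3 implementation: the two objectives (predicted accuracy and training time), the candidate names, and the numbers are all assumptions made for the example.

```python
from typing import List, Tuple

# (name, predicted accuracy, predicted training time in seconds); hypothetical values
Candidate = Tuple[str, float, float]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates.

    A dominates B when A is at least as accurate and at least as fast as B,
    and strictly better on at least one of the two objectives.
    """
    front = []
    for name, acc, time in candidates:
        dominated = any(
            (o_acc >= acc and o_time <= time) and (o_acc > acc or o_time < time)
            for _, o_acc, o_time in candidates
        )
        if not dominated:
            front.append((name, acc, time))
    return front


candidates = [
    ("cell_a", 0.90, 120.0),
    ("cell_b", 0.92, 300.0),
    ("cell_c", 0.88, 150.0),  # dominated by cell_a (less accurate and slower)
    ("cell_d", 0.92, 350.0),  # dominated by cell_b (same accuracy, slower)
]
front_names = [name for name, _, _ in pareto_front(candidates)]
print(front_names)  # ['cell_a', 'cell_b']
```

Only the two non-dominated candidates would be passed on for training in this toy setting, which is the sense in which a Pareto filter reduces the number of architectures sampled.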

PL-$k$NN: A Parameterless Nearest Neighbors Classifier

Danilo Samuel Jodas , Leandro Aparecido Passos , Ahsan Adeel , João Paulo Papa

Categories: Machine Learning

2022-09-26

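Minimal parameter setup is desirable in machine learning models to avoid time-consuming optimization processes. $k$-Nearest Neighbors is one of the most effective and straightforward models, used in numerous problems. Despite its well-known performance, it requires a value of $k$ suited to the specific data distribution, which demands expensive computational effort. This paper proposes a $k$-Nearest Neighbors classifier that bypasses the need to define the value of $k$. The model computes the $k$ value adaptively, considering the data distribution of the training set. We compare the proposed model against the standard $k$-Nearest Neighbors classifier and two parameterless versions from the literature. Experiments on 11 public datasets confirm the robustness of the proposed approach, as the results obtained were similar or even better.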
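
As background for why the choice of $k$ matters, here is a minimal sketch of the standard $k$-Nearest Neighbors classifier with majority voting. It is the baseline, not the proposed PL-$k$NN method; the function name `knn_predict` and the toy data are assumptions made for illustration.

```python
from collections import Counter
from math import dist
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, class label)


def knn_predict(train: List[Sample], query: Sequence[float], k: int) -> str:
    """Classify `query` by majority vote among its k nearest training samples."""
    # Sort training samples by Euclidean distance to the query and keep the first k.
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three points of class "a" near the origin, two of class "b".
train = [
    ((0.0, 0.0), "a"),
    ((0.0, 1.0), "a"),
    ((1.0, 0.0), "a"),
    ((5.0, 5.0), "b"),
    ((5.0, 6.0), "b"),
]
query = (4.0, 4.0)
print(knn_predict(train, query, k=1))  # b
print(knn_predict(train, query, k=5))  # a
```

The same query receives a different label depending on $k$, which is the tuning burden a parameterless variant aims to remove.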

Low Cost Embedded Vision System For Location And Tracking Of A Color Object

Diego Ayala , Danilo Chavez , Leopoldo Altamirano Robles

Categories: Computer Vision

2022-07-28

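This paper describes the development of an embedded vision system for the detection, location, and tracking of a color object. It uses a single 32-bit microprocessor to acquire image data, process it, and perform actions according to the interpreted data. The system is intended for applications that need artificial vision to detect, locate, and track a color object, and its objective is to achieve this at reduced size, power consumption, and cost.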

A Medical Information Extraction Workbench to Process German Clinical Text

Roland Roller , Laura Seiffe , Ammer Ayach , Sebastian Möller , Oliver Marten , Michael Mikhailov , Christoph Alt , Danilo Schmidt , Fabian Halleck , Marcel Naik

Categories: Natural Language Processing

2022-07-08

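Background: In the information extraction and natural language processing domain, accessible datasets are crucial for reproducing and comparing results. Publicly available implementations and tools can serve as benchmarks and facilitate the development of more complex applications. However, in the context of clinical text processing, accessible datasets are scarce, and so are existing tools. One of the main reasons is the sensitivity of the data. This problem is even more evident for non-English languages. Approach: To address this situation, we introduce a workbench: a collection of German clinical text processing models. The models are trained on a de-identified corpus of German nephrology reports. Result: The presented models provide promising results on in-domain data. Moreover, we show that our models can also be successfully applied to other biomedical text in German. Our workbench is made publicly available so that it can be used out of the box, as a benchmark, or transferred to related problems.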

Resource Allocation in Multicore Elastic Optical Networks: A Deep Reinforcement Learning Approach

Juan Pinto-Ríos , Felipe Calderón , Ariel Leiva , Gabriel Hermosilla , Alejandra Beghelli , Danilo Bórquez-Paredes , Astrid Lozada , Nicolás Jara , Ricardo Olivares , Gabriel Saavedra

Categories: Machine Learning | Artificial Intelligence

2022-07-05

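A deep reinforcement learning approach is applied, for the first time, to solve the routing, modulation, spectrum and core allocation (RMSCA) problem in dynamic multicore fiber elastic optical networks (MCF-EONs). To do so, a new environment, compatible with OpenAI's Gym, was designed and implemented to emulate the operation of MCF-EONs. The new environment processes the agent's actions (selection of route, core and spectrum slot) by considering the network state and physical-layer-related aspects. The latter include the available modulation formats and their reach, and inter-core crosstalk (XT), an MCF-related impairment. If the resulting signal quality is acceptable, the environment allocates the resources selected by the agent. After processing the agent's action, the environment gives the agent a numerical reward and information about the new network state. The blocking performance of four different agents was compared through simulation with 3 baseline heuristics used in MCF-EONs. Results obtained for the NSFNet and COST239 network topologies show that the best-performing agent achieves, on average, up to a four-fold reduction in blocking probability relative to the best-performing baseline heuristic.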
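
The interaction loop described above (the agent picks resources, the environment checks them, allocates if acceptable, and returns a reward plus the new state) can be sketched with a toy environment that mimics the classic Gym `reset`/`step` shape. Everything here is a simplifying assumption and not the authors' environment: the class `ToySpectrumEnv` is hypothetical, it models a single link with no routing, no modulation formats and no crosstalk check, the reward values are invented, and it does not import the Gym library.

```python
import random
from typing import Dict, List, Tuple

Observation = Tuple[List[List[int]], int]  # (core x slot occupancy grid, slots requested)


class ToySpectrumEnv:
    """Hypothetical single-link stand-in for a multicore spectrum-allocation environment."""

    def __init__(self, num_cores: int = 3, num_slots: int = 8, seed: int = 0):
        self.num_cores = num_cores
        self.num_slots = num_slots
        self.rng = random.Random(seed)
        self.reset()

    def reset(self) -> Observation:
        # 0 = free slot, 1 = occupied slot
        self.grid = [[0] * self.num_slots for _ in range(self.num_cores)]
        self.demand = self.rng.randint(1, 3)  # slots needed by the next request
        return self._observation()

    def _observation(self) -> Observation:
        return [row[:] for row in self.grid], self.demand

    def step(self, action: Tuple[int, int]) -> Tuple[Observation, float, bool, Dict]:
        core, first = action
        last = first + self.demand
        # The request fits only if the core exists and the contiguous range is free.
        fits = (
            0 <= core < self.num_cores
            and 0 <= first
            and last <= self.num_slots
            and not any(self.grid[core][first:last])
        )
        if fits:
            for slot in range(first, last):
                self.grid[core][slot] = 1
        reward = 1.0 if fits else -1.0  # assumed reward scheme
        self.demand = self.rng.randint(1, 3)
        done = all(all(row) for row in self.grid)
        return self._observation(), reward, done, {"blocked": not fits}


def first_fit(obs: Observation) -> Tuple[int, int]:
    """Baseline heuristic: choose the first core and slot range that is free."""
    grid, demand = obs
    for core, row in enumerate(grid):
        for first in range(len(row) - demand + 1):
            if not any(row[first:first + demand]):
                return core, first
    return 0, 0  # nothing fits; this request will be blocked


env = ToySpectrumEnv()
obs = env.reset()
blocked = 0
for _ in range(20):
    obs, reward, done, info = env.step(first_fit(obs))
    blocked += info["blocked"]
    if done:
        break
print("blocked requests:", blocked)
```

A learning agent would replace `first_fit` with a policy trained on the rewards, and blocking counts like the one printed here are the kind of quantity compared between agents and heuristics.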

BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis

Yichong Leng , Zehua Chen , Junliang Guo , Haohe Liu , Jiawei Chen , Xu Tan , Danilo Mandic , Lei He , Xiang-Yang Li , Tao Qin

Categories: Machine Learning

2022-05-30

Binaural audio plays a significant role in constructing immersive augmented and virtual realities. As it is expensive to record binaural audio from the real world, synthesizing it from mono audio has attracted increasing attention. This synthesis process involves not only the basic physical warping of the mono audio, but also room reverberations and head/ear related filtrations, which, however, are difficult to accurately simulate in traditional digital signal processing. In this paper, we formulate the synthesis process from a different perspective by decomposing the binaural audio into a common part that is shared by the left and right channels as well as a specific part that differs in each channel. Accordingly, we propose BinauralGrad, a novel two-stage framework equipped with diffusion models to synthesize them respectively. Specifically, in the first stage, the common information of the binaural audio is generated with a single-channel diffusion model conditioned on the mono audio, based on which the binaural audio is generated by a two-channel diffusion model in the second stage. Combining this novel perspective of two-stage synthesis with advanced generative models (i.e., the diffusion models), the proposed BinauralGrad is able to generate accurate and high-fidelity binaural audio samples. Experiment results show that on a benchmark dataset, BinauralGrad outperforms the existing baselines by a large margin in terms of both objective and subjective evaluation metrics (Wave L2: 0.128 vs. 0.157, MOS: 3.80 vs. 3.61). The generated audio samples (https://speechresearch.github.io/binauralgrad) and code (https://github.com/microsoft/NeuralSpeech/tree/master/BinauralGrad) are available online.

From data to functa: Your data point is a function and you can treat it like one

Emilien Dupont , Hyunjik Kim , S. M. Ali Eslami , Danilo Rezende , Dan Rosenbaum

Categories: Machine Learning

2022-01-28

It is common practice in deep learning to represent a measurement of the world on a discrete grid, e.g. a 2D grid of pixels. However, the underlying signal represented by these measurements is often continuous, e.g. the scene depicted in an image. A powerful continuous alternative is then to represent these measurements using an implicit neural representation, a neural function trained to output the appropriate measurement value for any input spatial location. In this paper, we take this idea to its next level: what would it take to perform deep learning on these functions instead, treating them as data? In this context we refer to the data as functa, and propose a framework for deep learning on functa. This view presents a number of challenges around efficient conversion from data to functa, compact representation of functa, and effectively solving downstream tasks on functa. We outline a recipe to overcome these challenges and apply it to a wide range of data modalities including images, 3D shapes, neural radiance fields (NeRF) and data on manifolds. We demonstrate that this approach has various compelling properties across data modalities, in particular on the canonical tasks of generative modeling, data imputation, novel view synthesis and classification. Code: https://github.com/deepmind/functa

Gait Recognition Based on Deep Learning: A Survey

Claudio Filipi Gonçalves dos Santos , Diego de Souza Oliveira , Leandro A. Passos , Rafael Gonçalves Pires , Daniel Felipe Silva Santos , Lucas Pascotti Valem , Thierry P. Moreira , Marcos Cleison S. Santana , Mateus Roder , João Paulo Papa

Categories: Computer Vision | Machine Learning

2022-01-10

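In general, biometrics-based control systems cannot rely on individuals' expected behavior or cooperation to operate appropriately. Instead, such systems should be aware of malicious procedures used in unauthorized access attempts. Some works in the literature suggest addressing the problem through gait recognition approaches. Such methods aim to identify human beings through intrinsic perceptible features, regardless of clothing or accessories. Although the issue is a relatively long-standing challenge, most of the techniques developed to handle it present several drawbacks related to feature extraction and low classification rates, among other issues. However, deep learning-based approaches have recently emerged as a robust set of tools for dealing with virtually any image and computer-vision related problem, providing paramount results for gait recognition as well. Therefore, this work provides a surveyed compilation of recent works on biometric detection through gait recognition, with a focus on deep learning approaches, emphasizing their benefits and exposing their weaknesses. It also presents categorized and characterized descriptions of the datasets, approaches, and architectures employed to tackle the associated constraints.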
